Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model

Authors

  • Aaron L. F. Han
  • Xiaodong Zeng
  • Derek F. Wong
  • Lidia S. Chao
Abstract

Named entity recognition (NER) plays an important role in natural language processing (NLP). Traditional methods tend to rely on large annotated corpora to achieve high performance. Unlike many semi-supervised learning models for the NER task, this paper employs a graph-based semi-supervised learning (GBSSL) method to exploit freely available unlabeled data. The experiments show that the unlabeled corpus can enhance the state-of-the-art conditional random field (CRF) learning model and has the potential to improve tagging accuracy, although the margin of improvement in the current experiments is small.
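
The abstract does not spell out the graph construction or the propagation step, so the following is only a minimal sketch of the general idea behind graph-based semi-supervised learning: labels spread from a few labeled nodes to unlabeled neighbors along weighted similarity edges. The function name propagate_labels, the toy graph, and the two-label setup are all assumptions made for illustration; this is not the authors' implementation and it omits the CRF stage entirely.

```python
def propagate_labels(weights, seeds, num_labels, iterations=50):
    """Iterative label propagation over a weighted graph (illustrative only).

    weights: dict mapping node -> {neighbor: similarity weight}
    seeds:   dict mapping labeled node -> label index
    Returns a dict mapping node -> list of label probabilities.
    """
    # Labeled nodes start one-hot; unlabeled nodes start uniform.
    dist = {}
    for node in weights:
        if node in seeds:
            dist[node] = [1.0 if k == seeds[node] else 0.0
                          for k in range(num_labels)]
        else:
            dist[node] = [1.0 / num_labels] * num_labels

    for _ in range(iterations):
        new_dist = {}
        for node, nbrs in weights.items():
            total = sum(nbrs.values())
            if node in seeds or total == 0:
                # Clamp labeled nodes; leave isolated nodes unchanged.
                new_dist[node] = dist[node]
                continue
            # Weighted average of the neighbors' current distributions.
            new_dist[node] = [
                sum(w * dist[n][k] for n, w in nbrs.items()) / total
                for k in range(num_labels)
            ]
        dist = new_dist
    return dist


# Hypothetical toy graph: "a" and "d" are labeled, "b" and "c" are not.
toy_weights = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 0.2},
    "c": {"b": 0.2, "d": 1.0},
    "d": {"c": 1.0},
}
toy_seeds = {"a": 0, "d": 1}  # label 0 = entity, label 1 = non-entity (assumed)
toy_result = propagate_labels(toy_weights, toy_seeds, num_labels=2)
```

In this toy setup the unlabeled node "b" ends up leaning toward the label of its strongly connected neighbor "a", and "c" toward that of "d". In a setting like the one the abstract describes, such propagated label distributions would presumably be combined with a CRF tagger, but how that is done is not stated here.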

Similar resources

Graph-based Semi-supervised Gene Mention Tagging

The rapidly growing biomedical literature has been a challenging target for natural language processing algorithms. One of the tasks these algorithms focus on is called named entity recognition (NER), often employed to tag gene mentions. Here we describe a new approach for this task, an approach that uses graph-based semi-supervised learning to train a Conditional Random Field (CRF) model. Bench...

Scientific Information Extraction with Semi-supervised Neural Tagging

This paper addresses the problem of extracting keyphrases from scientific articles and categorizing them as corresponding to a task, process, or material. We cast the problem as sequence tagging and introduce semi-supervised methods to a neural tagging model, which builds on recent advances in named entity recognition. Since annotated training data is scarce in this domain, we introduce a graph...

Effective Bilingual Constraints for Semi-Supervised Learning of Named Entity Recognizers

Most semi-supervised methods in Natural Language Processing capitalize on unannotated resources in a single language; however, information can be gained from using parallel resources in more than one language, since translations of the same utterance in different languages can help to disambiguate each other. We demonstrate a method that makes effective use of vast amounts of bilingual text (a....

A Joint Model to Identify and Align Bilingual Named Entities

In this article, an integrated model is derived that jointly identifies and aligns bilingual named entities (NEs) between Chinese and English. The model is motivated by the following observations: (1) whether an NE is translated semantically or phonetically depends greatly on its entity type, (2) entities within an aligned pair should share the same type, and (3) the initially detected NEs can ...

Semi-supervised Sequence Labeling for Named Entity Extraction based on Tri-Training: Case Study on Chinese Person Name Extraction

Named entity extraction is a fundamental task for many knowledge engineering applications. Existing studies rely on annotated training data, which is quite expensive when used to obtain large data sets, limiting the effectiveness of recognition. In this research, we propose an automatic labeling procedure to prepare training data from structured resources which contain known named entities. Whi...

Publication date: 2015